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Abstract —A new scheme to sample signals defined in the nodes 
of a graph is proposed. The underlying assumption is that such 
signals admit a sparse representation in a frequency domain related 
to the structure of the graph, which is captured by the so-called 
graph-shift operator. Most of the works that have looked at this 
problem have focused on using the value of the signal observed 
at a subset of nodes to recover the signal in the entire graph. 
Differently, the sampling scheme proposed here uses as input 
observations taken at a single node. The observations correspond 
to sequential applications of the graph-shift operator, which are 
linear combinations of the information gathered by the neighbors 
of the node. When the graph corresponds to a directed cycle 
(which is the support of time-varying signals), our method is 
equivalent to the classical sampling in the time domain. When the 
graph is more general, we show that the Vandermonde structure 
of the sampling matrix, which is critical to guarantee recovery 
when sampling time-varying signals, is preserved. Sampling and 
interpolation are analyzed first in the absence of noise and then 
noise is considered. VVe then study the recovery of the sampled 
signal when the specific set of frequencies that is active is not 
known. Moreover, we present a more general sampling scheme, 
under which, either our aggregation approach or the alternative 
approach of sampling a graph signal by observing the value of 
the signal at a subset of nodes can be both viewed as particular 
cases. The last part of the paper presents numerical experiments 
that illustrate the results developed through both synthetic graph 
signals and a real-world graph of the economy of the United States. 


Index Terms —Graph signal processing. Sampling, Interpolation, 
Error covariance. Support selection 


I. Introduction 

Sampling (and subsequent interpolation) is a cornerstone 
problem in classical signal processing 0 - The emergence of 
new fields of knowledge such as network science and big data 
is generating a pressing need to extend the results existing for 
classical time-varying signals to signals defined on graphs Q- 
0 - This not only entails modifying the algorithms currently 
available for time-varying signals, but also gaining intuition on 
what concepts are preserved (and lost) when a signal is defined, 
not in the classical time grid, but in a more general graph 
domain. 

This paper investigates the sampling and posterior recovery of 
signals that are defined in the nodes of a graph. The underlying 
assumption is that such signals admit a sparse representation in 
a (frequency) domain which is related to the structure of the 
graph where these signals reside. Most of the current efforts in 
this field have been focused on using the value of the signal 
observed at a subset of nodes to recover the signal in the 
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entire graph 0-0- Our proposal in this paper is different. 
We present a new sampling method that accounts for the graph 
structure, can be run at a single node and only requires access 
to information of neighboring nodes. Moreover, we also show 
that the proposed method shares similarities with the classical 
sampling and interpolation of time-varying signals. When the 
graph corresponds to a directed cycle, which is the support 
of classical time-varying signals, our method is equivalent 
to classical sampling. When the graph is more general, the 
Vandermonde structure of the sampling matrix, which is critical 
to guarantee recovery in classical sampling ||T], is preserved. 
Such a structure not only facilitates the interpolation process, 
but also helps to draw some connections between the proposed 
method and the sampling of time-varying signals. Sampling and 
interpolation are analyzed first in the absence of noise, where 
the conditions under which recovery is guaranteed are identified. 
The conditions depend both on the structure of the graph and 
the particular node taking the observations. They also reveal that 
one way to understand bandlimited graph signals is to think of 
signals that can be well approximated by only observing the 
value of the signal at a small neighborhood. We then analyze 
the sampling and reconstruction process when noise is present 
and when the specific frequencies where the signal is sparse 
are not known. For the noisy case, an interpolator based on the 
Best Linear Unbiased Estimator (BLUE) is designed and the 
effect on the corresponding error covariance matrix of different 
noise models is discussed. Eor the case of unknown frequency 
support, we also provide conditions under which the signal can 
be identified. This second problem falls into the category of 
sparse signal reconstruction pO)- HD where the main idea is 
to leverage the structure of the observation matrix to facilitate 
recovery. The last contribution is the design of a generalization 
of our sampling method that considers a subset of nodes, each 
of them taking multiple observations. Within that generalization, 
the approach of sampling a graph signal by observing the 
value of the signal at a subset of nodes can be viewed as a 
particular case. Hence, the generalization will also be useful to 
compare and establish relationships between existing approaches 
to sample signals in graphs and our proposed method. 

The paper is organized as follows. Section |I^ introduces the 
new aggregation sampling method, compares it to the existing 
selection sampling method and shows that for classical time- 
varying signals both methods are equivalent. Section[In| analyzes 
our sampling method in more detail and applies it to sample 
bandlimited graph signals. The analysis includes conditions for 
recovery, which are formally stated in Section [Tll-C Section IV 


investigates the effect of noise in aggregation sampling. It also 
discusses how to select sampling nodes and observation schemes 
that lead to a good recovery performance. Corresponding mod¬ 
ifications in the interpolation in order to recover the signal 
when the support is not known are discussed in Section [V] 
Section VI proposes a generalization under which the existing 
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selection sampling and the proposed aggregation sampling can 
be viewed as particular cases. Several illustrative numerical 
results are presented in Section |V1I| A few concluding remarks 
are provided in Section |VIII| which closes the paper. 

Notation: Boldface capital letters denote matrices and boldface 
lowercase letters column vectors. Generically, the entries of a 
matrix X and a vector x are denoted as Xij and xf, however, 
when contributing to avoid confusion, the alternative notation 
[X]ij and [x]i will be used. The notations ^ and ^ stand 
for transpose and transpose conjugate, respectively; ® is the 
Kronecker product; trace(X) := the trace of the 

square matrix X and det(X) is its determinant; diag(x) is a 
diagonal matrix satisfying [diag(x)]ii = [xj^; vec(X) is the 
column-wise vectorized version of matrix X; e, is the i-th iV x 1 
canonical vector (all entries of are zero except the *-th one, 
which is one); Ek ■= [ei,...,eif] is a tall matrix collecting 
the K first canonical vectors; and 0 and 1 are, respectively, 
the all-zeros and all-ones matrices (when not clear from the 
context, a subscript indicating the dimensions will be used). The 
modulus (remainder) obtained after dividing a: by iV is denoted 
as modAr(a:). 


II. Sampling of graph signals 

Let Q = {Af,£) denote a directed graph. The set of nodes JV 
has cardinality N, and the set of links S is such that {i,j) G S 
if and only if node i is connected to node j. The set Afi : 
{j l(j)*) ^ contains all nodes with an incoming connection 
to i and is termed the incoming neighborhood of i. For any given 
graph we dehne the adjacency matrix A as a sparse N x N 
matrix with nonzero elements Aji if and only if {i,j) G S. The 
value of Aji captures the strength of the connection between i 
and j. When the graph is unweighted, the nonzero elements of 
A are set to one. The focus of this paper is not on analyzing 
Q, but a graph signal dehned on the set of nodes Af. Such a 
signal can be represented as a vector x = [xi,..., xat]^ G 
where the i-th component represents the value of the signal at 
node i, or, equivalently, as a function f : A/ —?■ R, dehned on 
the vertices of the graph. 

The graph ^ is endowed with a graph-shift operator S dehned 
as an A X A matrix whose entry (i, j), denoted as Sij, can be 
nonzero only if z = j or (j, z) G £. The sparsity pattern of 
the matrix S captures the local structure of Q but we make 
no specihc assumptions on the values of the nonzero entries 
of S. Common choices for S are the adjacency matrix of the 
graph g, ig, the Laplacian Q, and its generalizations ng. 
The intuitive interpretation of S is that it represents a linear 
transformation that can be computed locally at the nodes of the 
graph. If y = [z/i,..., is dehned as y = Sx, then node z 
can compute yi provided that it has access to the values of Xj 
at its incoming neighbors j G Afi. We assume henceforth that S 
is diagonalizable, so that there exists a A x A matrix V and a 
A X A diagonal matrix A that can be used to decompose S as 

S = VAV-^ (1) 

In particular, Q is true for normal matrices satisfying SS^ = 
S^S. In that case we have that V is unitary, which implies 
= V^, and leads to the decomposition S = VAV^. 

A natural dehnition of sampling for a graph signal is to 
introduce a fat A x A selection matrix C and dehne the sampled 
signal as Q 

( 2 ) 



Fig. 1: Sampling in the time domain as selection sampling on a directed 
cycle graph. The discrete time domain can be represented by a cyclic 
graph connecting node z to z -F 1 and node A to node 1. Using the 
uniform selection matrix C = [ei,e]v/ir+i, • ■ ■,ejv_iv/jr+il^ with 
A/A = 1/2 results in the selection of the signal values at odd indexed 
nodes xi, xa,..., xn-\. Observe that this sampling rule is independent 
of the structure of the underlying graph. 


If the matrix C is chosen as binary, i.e., with elements G 
{0,1}, has a single nonzero element per row, and at most one 
nonzero element per column, then the signal x is a selection of 
K out of the A elements of x. In such a case, the ratio K/N is 
the sampling rate of the signal. Uniform sampling amounts to 
setting C = [ei, ejv/if+i, ■ ■ •, e^v-Ar/if+i]^ and the selection 
of the hrst K elements of x is accomplished by setting C = 
■— [si) ■ • •) We remark that, in general, it is not clear 

how to choose good selection matrices C. This is in contrast 
to conventional sampling of signals in the time domain where 
uniform sampling is advantageous. 

An equally valid, yet less intuitive, dehnition is to hx a node, 
say z, and consider the sampling of the signal seen by this 
node as the shift operator S is applied recursively. To describe 
this sampling methodology more clearly, dehne the Lth shifted 
signal y*^*^ := S^x and further dehne the A x A matrix 

Y := (3) 

that groups the signal x and the result of the hrst A — 1 
applications of the shift operator. Associating the z-th row of 
Y with node z, we dehne the successively aggregated signal at 
z as Yi := (efY)^ = Y^e^. Sampling is now reduced to the 
selection of K out of the A elements (rows) of y^, which we 
accomplish with a selection matrix C [cf 0] 

y, := Cy, = C (Y^e,) • (4) 

We say that the signal y^ samples x with successive local 
aggregations. This nomenclature follows from the fact that y^^^ 
can be computed recursively as y^*) := and that the z-th 

element of this vector can be computed using signals associated 
with itself and its incoming neighbors, 

yP = E (5) 

feAi 

We can then think of the signal y^ as being computed locally 
at node i using successive variable exchanges with neighboring 
nodes. In fact, it is easy to show that z/^ can be expressed as a 
linear combination of the values of Xj at nodes j whose distance 
(number of hops) from node z is less than or equal to 1. This 
implies that the sampled signal y^ in Q is a selection of values 


X = Cx. 
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Fig. 2; Sampling in the time domain as aggregation sampling in a directed cycle graph. In aggregation sampling we utilize successive applications 
of a shift operator determined hy the given graph and sample the resulting signal observed at a given node. Using the cycle adjacency matrix 
Adc as shift operator results in the signal x rotating through the graph and the selection of elements of the aggregated signal reduces to 
conventional sampling. Aggregation sampling is, therefore, a generalization of conventional sampling to graph signals that utilizes the underlying 
graph structure in the construction of samples. 


that node i can determine locally. An underlying idea behind the 
sampling in Q is to incorporate the structure of the shift into 
the sampling procedure. Indeed, S and play key roles in 
other graph-processing algorithms such as shift-invariant graph 
filters Q, where the output of the hlter can be viewed a linear 
combination of the shifted signals 

To understand the difference between selection sampling [cf. 
@] and aggregation sampling [cf. Q], it is instructive to 
consider their application to a signal defined in the time domain. 
We do so in the following section. 


generalizations of conventional sampling. In general, selection 
sampling and aggregation sampling produce different outcomes. 
In selection sampling we move through nodes to collect samples 
at points uniquely identified by C, whereas in aggregation 
sampling we move the signal through the graph while collecting 
samples at a fixed node. Observe that because aggregation 
sampling depends on the shift operator, it incorporates the 
structure of the graph into the sampling procedure. This is not 
true for selection sampling except for the choice of matrices C 
adapted to particular graphs. 


A. Example: Sampling in the time domain 

Classical time domain signals can be represented as graph 
signals defined on top of a directed cycle graph 
illustrated in Fig. To do so define the directed cycle graph 
Qdc as one in which the edge set Sdc '■= {(u modAr(i) -F 
connects node i to node i + 1 for all nodes except N, which is 
connected to node 1. The elements of the adjacency matrix of 
this graph, denoted as Adc, are zero except for the ones in the 
first cyclic subdiagonal, which are one. 

For a signal x defined on top of the directed cycle Qdc, 
we consider selection sampling and aggregation sampling when 
using the shift operator S = Adc and the uniform selection 
matrix C = [ei, ejv/_R-+i,..., ejv_jv/if-i-i]^- Illustrations of the 
respective sampling procedures are available in Figs. [T] andfor 
a signal with N = 6 elements and sampling rate K/N = 1/2. 

In selection sampling we just multiply the graph signal x 
with the selection matrix C to obtain the sampled signal x = 
Cx as indicated by (|^. In aggregation sampling we consider 
subsequent applications of the shift matrix S = Adc- Each of 
these shifts amounts to rotating the signal clockwise so that 
the element at node i moves to node * -F 1 for all i < W and 
the element at node N moves to node 1. If we consider, e.g., 
node 1 = 1, the first shift moves signal xn lo this node so that 

= xn, the second shift moves signal Xn-i to this node so 
( 2 ) 

that y\ = xn-i and so on. It follows that the aggregated signal 
yi in ([^ is given by yi = [xi,xm, xjq-i ,..., X 2 ]- This is just 
a shift of the original signal x, which, upon multiplication by 
the selection matrix C as per 0 results in a vector yi = Cyi 
that contains the same elements that x contains. 

For the cycle graph and shift operator S = Adc selection 
and aggregation sampling produce not only equivalent sam¬ 
pled signals but also reduce to conventional sampling. This 
is not a coincidence because both methods are designed as 


III. Sampling of bandlimited graph signals 


Recovery of the original signal from its sampled version is 
possible under the assumption that the original signal admits 
a sparse representation. This section begins by introducing 
the concept of a bandlimited graph signal, which is sparse in 
the frequency domain, and establishing some connections with 
the concept of bandlimitedness in the classical time domain. 


Section III-B reviews briefly the recovery of a bandlimited 
graph signal for the case of selection sampling. Section |III-C| 
analyzes the recovery of a bandlimited graph signal for the case 
of aggregation sampling. 


A. Bandlimited graph signals 

The common practice when addressing the problem of sam¬ 
pling signals in graphs is to suppose that the graph-shift operator 
S plays a key role in explaining the signals of interest x. More 
specifically, that x can be expressed as a linear combination of 
a subset of the columns of V = [vi,..., v^r], or, equivalently, 
that the vector x = V“^x is sparse. In this context, vectors 
Vfe are interpreted as the graph frequency basis and Xk as 
the corresponding signal frequency coefficients. To simplify 
exposition, it will be assumed throughout the paper that the 
active frequencies are the hrst K ones, which are associated with 
the largest eigenvalues Q, fTh) . Under this assumption, it holds 
that X = [xi,..., Xif, 0,..., Op'^. However, the results presented 
in the paper can be applied to any set of active frequencies K. of 
size K provided that JC is known. For convenience, we define 
S/ K '■= [vi,...,Vif] and SLk '■= \xi, ...,xkY' so that we may 
write X = [x^ | Oix^v-if]^- For x to be sparse, it is reasonable 
to assume that S is involved in the generation of x. 

When Q = Qdc, it can be easily shown that setting the shift 
operator either to S = Adc or to S = Tide '■= I — gives rise 
to the Fourier basis F. More formally, that the right eigenvectors 
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of S satisfy V = F, with Fij := 

j := y/—l. Selecting S = A^c has the additional advantage 
of satisfying An = i.e., the eigenvalues of the 

shift operator correspond to the classical discrete frequencies. 
Interpretations for the eigenvalues of the Laplacian matrix 
also exist Q. 


B. Selection sampling of bandlimited graph signals 

Under the selection sampling approach Q-|9j, sampling a 
graph signal amounts to setting x = Cx [cf. ©]■ Since the 
K X N binary selection matrix C indexes the nodes that are 
observed, the issue then is how to design C, i.e., which nodes to 
select, and how to recover the original signal x from its samples 

X. 

To answer these questions, it is assumed that the signal x is 
bandlimited, so that it can be expressed as a linear combination 
of the K principal eigenvectors in V. The sampled signal x 
is then x = Cx = CVk^k- Clearly, if the matrix CVk 
is invertible, then xk can be recovered from x. Once the 
coefficients xx are known, the signal in the original domain can 
be found as x = Vk^k- Combining the previous equations, we 
have 

x = Yk^k=Vk{CYk)-^x. ( 6 ) 

The expression in (|^ shows how the original signal can be 
interpolated from its samples. For the previous equation to 
hold true, the matrix CYx has to be invertible. Hence, the 
key for guaranteeing perfect signal reconstruction is to select a 
subset of nodes such that the corresponding rows in are 
linearly independent. In the classical domain of time-varying 
signals, the (Fourier) basis has a Vandermonde structure, both 
row-wise and column-wise. This readily implies that any subset 
of K rows will give rise to a (row-wise) Vandermonde matrix 
and, hence, invertibility is guaranteed. However, for an arbitrary 
graph this is not guaranteed and algorithms to select a specific 
subset that guarantees recovery are required Q. The role of the 
Vandermonde structure of the sampling matrix will be analyzed 
in more detail in the ensuing sections. 


C. Aggregation sampling of bandlimited graph signals 

As explained in under the aggregation approach the 
sampled signal is formed by observations of the shifted signals 
y(0 — s*x taken at a given node i. Under this second approach, 
the graph-shift operator S plays a key role not only in explaining 
and recovering x, but also in sampling x. Another reason to 
consider this scheme is that the entries of can be found 
by sequentially exchanging information among neighbors. This 
implies that; a) for setups where graph vertices correspond to 
nodes of an actual network, the procedure can be implemented 
distributedly; and b) if recovery is feasible, the observations at 
a single node can be used to recover the signal in the entire 
graph. 

Mimicking the approach in the previous section, we first ana¬ 
lyze how the bandlimitedness of x is manifested on the sampled 
signal. Then, we identify under which conditions recovery is 
feasible and describe the corresponding interpolation algorithm. 
For the ease of exposition, the dependence of y^ on x is given 
in the form of a lemma. 


Lemma 1: Define the N xl vector Vi := V^e^, which collects 
the values of the frequency basis at node i, and the 

N X N (column-wise) Vandermonde matrix 


H, ■- 


( 


1 

Ai 



V A 


N-1 

1 


A 


N-l 

N 


/ 


(7) 


Then, the shifted signal y^ can be expressed as 


y, = ’®'diag(t;,)x. (8) 

Proof: Using the spectral decomposition of S, signal y*^9 can 
be written as 


y(') = S'x = (VA'V-i)x = (VA')x. (9) 


Based on the definitions of y^ and Vi, it follows that 

y, = = [YY-^Yfe, 

= (V-iY)^V^e, = (V-iY)^u,. (10) 

Since the Z-th column of matrix Y is it can be 

written as (VA^“^)x [cf. Hence, the Z-th column of 

matrix (V“^Y) can be written as A*“^x or, equivalently, as 
diag(x)[A5^“^,..., Leveraging the fact that the vector 

containing the Z-th power of the eigenvalues corresponds to the 
row Z -f 1 of matrix \P, the shifted signal y^ can be expressed 
as 


y, = (V-iY)^t., = (diag(x)^^)^w, 

= ’Fdiag(x)r:i = ^diag(r»j)x, (11) 

which is the claim in the lemma. ■ 

Notice that while in Section [riI-B| the relationship between the 
sparse frequency coefficients x and the signal to be sampled was 
simply given by x = Vx, now it is given by y^ = ’®'diag(i;i)x. 

Next, we use Lemma [T] to identify under which conditions 
recovery is feasible. To do this, let us define the N x K matrix 
= ’®'diag(r:i)E^. Then, the sampled signal y^ is 

Yi = Cyi = C'Fdiag(r»i)x = C'^iXx, (12) 

where C is the binary K x N selection matrix, and xx the 
vector collecting the non-zero components of x. To simplify 
exposition, for the time being we will assume that C — i.e., 

that the observations correspond to the original signal and the 
first K—1 shifts. This assumption can be relaxed, as discussed 
in Remark [T] 

If matrix C'5'i is invertible, then xx can be recovered from 
y. [cf. ([T2|] and, once Xx is known, x can be found as x = 
Yx^K- Combining the previous expressions, we have [cf. 

x = Yx^K = ^K{C^^)-^y^■ (13) 

The expression in ( [T3| ) shows how the original signal can be 
interpolated from its samples. As already stressed, for the pre¬ 
vious equation to hold true, the matrix C\Pi has to be invertible. 
Hence, the key for guaranteeing perfect signal reconstruction is 
to select samples such that the corresponding rows in ’4'^ are 
linearly independent. While for the selection sampling described 
in Section |III-B| there is no straightforward way to check the 
invertibility of CYx (existing algorithms typically do that by 
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inspection Q), for the aggregation sampling described in 
( [T 3 I 1 the invertibility of can be guaranteed if the conditions 
presented in the following proposition hold. 

Proposition 1: Let x and y, be, respectively, a bandlimited 
graph signal with at most K non-zero frequency components 
and the output of the sampling process defined in ( |12| l. Then, 
the N entries of signal x can be recovered from the K samples 
in Yi if the two following conditions hold 

i) The first K eigenvalues of the graph-shift operator S are 
distinct; i.e., Xi f Xj for all i j, i < K and j < K. 

ii) The K first entries of Vi are non-zero. 

Proof: To prove the proposition it suffices to show that i) and 
ii) guarantee the invertibility of C'5'i [cf. ([T3])]. Matrix C’®'i 
can be understood as the multiplication of two matrices: matrix 
and matrix (E^diag(t^i)E;f). It is immediate that 
condition ii) guarantees that the second matrix is invertible. 
Moreover, condition i) guarantees invertibility of the first matrix. 
To see this, note that (’®'E;f) is a N x K (column-wise) 
Vandermonde matrix. Hence C(\PEif) is a selection of the first 
K rows of ('i'E^), which is also Vandermonde. Any square 
Vandermonde matrix has full rank provided that the basis (i.e., 
the eigenvalues of S) are distinct, as required in condition i). ■ 

One of the implications of the proposition is that there is no 
need to compute or observe the entire vector y^, since its first 
K entries suffice to guarantee recovery. 

The conditions in Proposition are not difficult to check and 
they provide additional insights on the behavior of the sampling 
and interpolation procedure. Condition i) refers to the structure 
of the entire graph. It states that if a graph has two identical 
frequencies and the signal of interest is a linear combination 
of both of them, then the K x K matrix (CtPE/f) cannot be 
inverted and the sampling procedure will fail, regardless of the 
chosen node. Note that this problem is not present in classical 
sampling of time-varying signals, because the eigenvalues of 
the Fourier Vandermonde matrix associated with S = Adc are 
always distinct. Condition ii) refers to the specific node where 
the samples of the shifted signal are observed. It basically states 
that any node in the network can be used to sample the signal 
provided that (ejvi) ^ 0 for fc = 1 ,..., iT; i.e., that the chosen 
node participates in the specific frequencies on which signal x 
is expressed. It also points to the fact that if |e^Ui| are non¬ 
zero but small, selecting i as the sampling node may give rise to 
interpolations that are potentially unstable if noise is present; see 
Section [Tv| For the particular case when S = Adc, condition ii) 
is satisfied since all the entries of the Fourier basis are non-zero. 


in Proposition [2. This implies that a node can reconstruct a 1- 
bandlimited signal based solely on the value that this signal takes 
at the node. For the case of a 2-bandlimited signal where x = 
aivi -|-Q! 2 V 2 , Proposition [T] guarantees reconstruction based on 
[yi]i and [yi] 2 , which only contain information about the signal 
at node i and at its neighbors. Therefore, one can understand 
bandlimited graph signals as signals that can be identified locally 
by relying on observations within a given number of hops. Note 
that this does not necessarily imply that the variation of the 
signal among close-by nodes is small, it only means that the 
pattern of variation can be inferred just by looking at close- 
by nodes. This discussion will be revisited in Section |V] For 
the recovery to be implemented locally too, the nodes need to 
know Vif and {Xk]^^i, i.e. the structure of the graph where 
the signal resides. 

We may decompose the interpolator V/f in ( [T3] l into 

three factors Vj^(E^diag('Ui)Ej^)“^(C\PEj^)“^ to reveal that 
it can be computed in closed-form, since a non-zero diagonal 
matrix can be trivially inverted and closed-form expressions 
for the inverse of a Vandermonde matrix exist tni- Moreover, 
notice that one of these three factors is related to the structure 
of the graph and the support where the signal is bandlimited 
V^; one is related to the structure of the graph, the support of 
the signal and the subset of observations and the 

third one depends on the specific node where the samples are 
taken (E^diag('Ui)EK 

Remark 1: The structure of the selection matrix C and, in 
particular, the fact that C’Q/Ejf is a Vandermonde matrix are 
instrumental to guarantee the recovery of the original signal. 
Note that C'S'E^f is Vandermonde not only when C = E]^, 
but also when C = [ei, ei+jvo, ■ • ■, en_(;^_i)7Vo]^, provided 
that 1 < Nq < N/K and X^° f X^° for all ki f k^, where 
ki < K and ^2 < K. By setting Nq = N/K, the counterpart of 
the classical time sampling theorem (which considers uniformly 
spaced samples) is recovered. Moreover, if none of the frequen¬ 
cies of interest is zero (i.e., if Afe 7 ^ 0 for k < K), then selection 
patterns of the form C = [e^^, e„„+Ar„, ..., are 

also guaranteed to lead to invertible matrices. In this case, the 
resultant matrix is not Vandermonde, but it can be expressed as a 
product of a Vandermonde and a non-zero diagonal matrix. For 
reference in the following sections, we define here the K x 
N matrix (tt-q, Ajj) := [cnoj ®no-i-Afoj ■ ■ • ? 
and the set of admissible K x N selection matrices Ck ■= 
{CK{no,No) I A^o = 1,...,A^/A: and tiq = 
No{K-l)}. 


D. Discussion 

Suppose that we know that x is indeed AT-bandlimited; i.e., 
that it can be expressed as a linear combination of the K first 
frequency basis vectors vi,... ,'Vk- Then, Proposition [T] states 
that a single node, say the i-th one, can reconstruct the entire 
graph signal just from its own signal x; and K — 1 exchanges 
with its neighbors. Note that one of the consequences of this 
result is that linear combinations of signals at nodes that are 
in a neighborhood of radius K — 1 suffice to reconstruct the 
entire graph signal. To be specific, suppose that x = avi, which 
represents the extreme case of a 1-bandlimited signal. Then, it 
follows that [yi]i = x; = a[vi]i, from where a can be found 
- and X reconstructed - as long as f 0 [cf. condition ii) 


IV. Sampling and interpolation in the presence oe 

NOISE 


When sampling in the absence of noise, two main questions 
are how to recover the signal from its samples and the conditions 
under which recovery is feasible. When the samples are noisy, 
perfect reconstruction is, in general, unfeasible and new issues 


arise. In Section IV-A we estimate the noisy signal through 


interpolation via the Best Linear Unbiased Estimator (BLUE) for 


a general noise model. In Section IV-B we specify noise models 


that are likely to arise in graph domains. Then, in Section IV-C 


we discuss the effect on the interpolation error of selecting the 
sampling node and the rows of the selection matrix. 
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A. BLUE interpolation 

Consider now that the shifted sampled signal is coiTupted 
by additive noise, so that the observed signal is given by 
Zi = Yi + Wi. The noise is assumed to be zero-mean, 
independent of the graph signal, and colored with a covariance 
matrix := E[wiwf^|. For notational convenience, we dehne 
also Wi = Cwi and rIJ = CR^^C^. 

Key to design the interpolator in the presence of noise is to 
notice that the relation between the observed samples z^ and the 
original signal x is given by 


and the inverse of the trace of its inverse 


64 := 


trace 


(r«- 


-1 


(24) 


Notice that the etTor metrics 63 and 64 are computed based on 

(i) 

the error covariance matrix for the frequency estimator Re 
instead of the time estimator since R^*^ is a singular matrix [cf. 
@]. 


B. Noise models 


Zi = C'i/.XK + (14) 

X = Vk^k- (15) 

The BLUE estimator of xk, which minimizes the least squares 
etTor, is given by m 

x^ = ('i'fc"(RW)-iC’®'.)”''®'fC^(RW)-iz,, (16) 

provided that the inverse in ( [T6| ) exists. Additionally, for the 
particular case of Gaussian noise in ( [T4l i, the estimator in 
([H coincides with the Minimum Variance Unbiased (MVU) 
estimator which attains the Cramer-Rao lower bound. In this 
case, it also holds true that the inverse of the error covariance 
matrix associated with corresponds to the Fisher Informa¬ 
tion Matrix (FIM) mi- Clearly the larger the number of rows 
in d. the better the estimation is. When the selection matrix 
C selects exactly K rows (and not more), reduces to 

X? = (17) 

_ __ j__ 

After obtaining x^ - either via ( [16) or ( |17) -, the time signal 
recovered at the i-th node x^®! can be found as 

x«=Vkx1^\ (18) 


Finally, the error covariance matrices for the frequency and time 

estimators R^*^ := E[(x;f — xJ)(xk — and R^*^ := 

E[(x - x«)(x - xW)^] are @ 


R« = ('I>fC^(R«)-iC'®',) \ (19) 

R« = V^R-l^^Vf. (20) 


Note that the etTor covariance matrix Re depends on the 
noise model, the frequencies of the graph (eigenvalues of the 
shift operator), the node taking the observations, and the sample- 
selection scheme adopted (cf. Remark [T]). 

The etTor covariance matrix enables us to assess the per¬ 
formance of the estimation. Smaller etTors give rise to better 
estimators. However, there exist multiple alternatives to quantify 
the etTor, as analyzed by the theory of optimal design of 
experiments | [T9) . The most common approach is to hnd an 
estimator which minimizes the trace of the covariance matrix 


ei := trace(R^®^), (21) 


which corresponds to the minimization of the Mean Square 
Error (MSE). Other common error metrics based on the error 
covariance matrix are the largest eigenvalue 

e2:=A„,ax(Ri*^), (22) 

the log determinant 

e3:=logdet(RW), (23) 


The results presented so far consider a general error covari¬ 
ance matrix R^^^ so that they can be used regardless of the 
color of the noise. In this section, we present three particular 
examples of interest. 


White noise in the observed signal z^. This implies that 
is white and therefore = a^l, with cr^ denoting the 
noise power. In this case, the KxK matrix Riu is given 
by 


Ri() = 


(25) 


• White noise in the original signal x. With w denoting the 
white additive noise present in x, we can use the linear 
observation model to write = \Irdiag(ui)V“^w. Then, 
the N X N error correlation matrix is simply given by 
R«j^ = cr^\Fdiag('Ui)V“^(V“^)'^diag(ui)'^’i'^. When 
the shift is a normal matrix, V is unitary and the previous 
expression reduces to R^^^ = \I/|diag('Lii)p\I/^. As 
before, the KxK error correlation matrix is obtained just 
by selecting the rows and columns of the former, 

RW =cr2C^|diag(wi)|2’5r^C^. (26) 


The previous expressions show not only that the noise is 
correlated, but also that the correlation depends on the 
graph structure (eigenvalues and eigenvectors of S), the 
node collecting the observations, and the specihc selection 
of observations. 

• White noise in the active frequency coefficients xk- With 
w/f denoting the white additive noise present in xk, 
we can use the linear observation model to write = 
’®'diag(r>i)EifWif =It follows that the NxN and 
KxK error covariance matrices are R^ ^ and 

R« = C^. (27) 


This model can be appropriate for scenarios where the 
signal of interest is the output of a given “graph process” 
- e.g., a diffusion process - and the noise is present in the 
input of that process. This noise model can also arise when 
the signal to be sampled has been previously processed with 
a low-pass graph hlter 

There are many other noise models that can be of interest 
in graph setups. For example, the error covariance can be a 
linear combination of the previous covariance matrices (noise is 
present in both the original signal and the observation process). 
Alternatively, the noise at a specific node can be also rendered 
dependent on the number of neighbors. This last situation 
would be reasonable, for example, in distributed setups where 
the information of neighboring nodes is exchanged via noisy 
channels. 
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C. Selection of the sampling set 

The two elements that define the set of samples to be 
interpolated are: the sampling node, i.e., the node i which 
aggregates the information; and the sample-selection scheme, 
i.e., the elements of selected by C. 

1) Selection of the sampling node: The recovery results in 
Section III-C show that any node i can be used to sample 
and recover the entire graph signal, provided that the entries 
of Vi corresponding to the active frequencies in x are non-zero. 
However, when noise is present, the error covariance matrix 
- which is the key element to evaluate the quality of 
the interpolation - is different for each i. In this context, it 
is reasonable to select as a sampling node one leading to a 
small error. Note that selecting the best one will only require 
the computation of N closed-form expressions which involve 
matrix inversions. In scenarios where computational complexity 
is a limiting factor, the structure of the noise correlation, as well 
as the structure of the interpolation matrix, can be exploited to 
reduce considerably the computational burden. E.g., for the case 
where white noise is present in the active frequency coefficients, 
when substituting into and ( |20l l, it follows that 


= a^I, R« = 


k^k- 


(28) 


feasible recovery according to the conditions stated in Proposi¬ 
tion [T] If C is not constrained to belong to Ck, the number of 
candidate matrices is N choose K. However, Ck has a much 
smaller cardinality: Nq can take at most N/K values, and uq 
at most {N — Nq{K — 1)). Moreover, as it was the case for 
the sampling node selection, in some cases the noise structure 
can be exploited to readily determine the optimal observation 
strategy. E.g., for the case where white noise is present in the 
active frequencies, it is immediate to see that the performance is 
independent of the sample-selection scheme [cf. @1. Eor the 
case where white noise is present in the observed signal, let us 
assume that the selection matrix is given by C = Ci<-(no, A^o) 
where Nq is fixed and we want to design ng. 

If we adopt eg in ( |2^ as our error metric, the goal is to 
find the value rip that minimizes det(Ri*^). To achieve this, 
consider two different selection matrices Ca = CK{no, Nq) 
and Cb = C/f(no + l,No). Using ( |29] l and assuming without 
loss of generality that = 1, the error covariance for is 
given by R^*^^ = (Ef diag('Lii)^^^Cf CB’i'diag(uj)EK)“^ 
A similar expression can be written for R^ Since is 
Vandermonde, it is not difficult to show that can 

be written as This implies that 


Consequently, for this particular noise model, the estimator 
performance is independent of the node choice. This is true 
for every error metric [cf. (|2T|i-(|24li]. The result is intuitive: 
given that the noise and the signal are present in the same 
frequencies, it is irrelevant if a node amplifies or attenuates a 
particular frequency. Differently, if the white noise is present in 
the observed signal, we can substitute ( |25] l into ( [T^ to obtain 

R« =a2(Efdiag(t7,)^'5'^C^C^diag(u,)E^f)-\ (29) 


Thus, if we are interested in minimizing, e.g., the error metric 
64 [Cf. (iigi], our objective may be reformulated as finding the 
optimal node i* such that 


j* = argmax trace ^E^diag(i;i)^’®'^C^C’S'diag(iJi)Eif ^ . 

(30) 

Eor a selection matrix of the form C = CK{no,No) (cf. 
Remark [^ 1 , the A:-th diagonal element of the matrix in ( |30l l can 
be written as |[t7i]fep J2m=o (n-o+mNo)^ 'pjjg simply 

the sum of those elements, so that, using the closed form for a 
geometric sum, (|30ll can be rewritten as 


i* = arg max 

i 


K 




|A/cP”“ - |Afe|2("o+^o^) 

' l-|AfeP^o ' 


(31) 


Thus, the optimal sampling node i* will be one with large 
values of for the active frequencies k < K. The relative 

importance of frequency k is given by the fraction in pT] ), which 
depends on the modulus of the associated eigenvalue and the 
structure of the selection matrix C (values of uq and Nq). 

2) Design of the sample-selection scheme: The error covari¬ 
ance matrix, and hence the different error metrics presented in 
depend on the selection matrix C. By changing C one 
can tradeoff the quality of a given sample and the detrimental 
effect of the corresponding noise. The specific set of samples 
that minimizes the error will in general depend on the error 
metric chosen. 

Recall that any matrix in the set of admissible selection 
matrices Ck defined in Remark [T] is guaranteed to lead to a 


' = Ef A^diag(r7,)^'®'^C^CA’®'diag(r7,)AE^f 
= (Eg A^E^jR^'^Ef AE;^). (32) 

Eor the first equality we have used that the product of diagonal 
matrices is commutative and for the second one that right and 
left multiplying by the canonical matrix amounts to selecting 
the columns and rows of the multiplied matrix. Using ( |32] i, we 
have that 

K 

det(R(*^) = det{R%) [] jA^p, (33) 

k^l 

which results in the following optimal strategy for the solution 
of 63: if nf=i |AfcP < 1 then Uq = 1, otherwise rig should be 
as large as possible; see Remark Equivalently, the optimal 
strategy states that if an application of the shift operator has an 
overall effect of amplification in the active frequencies, then we 
should aim to apply it as many times as possible, whereas if the 
opposite is true, we should avoid its application. 

Needless to say, one can also look at selection matrices that 
are not always guaranteed to lead to a Vandermonde structure, 
i.e., matrices not in Ck- In that case, the problem can be 
formulated as a binary optimization over C, which is typically 
challenging. If the size of the space search (N choose K) is 
not too large, the problem can be solved by exhaustive search - 
first by checking that the matrix guarantees recovery and then 
evaluating the corresponding error covariance. Eor more general 
cases, a reasonable approach is to formulate the problem, relax 
it, and exploit the problem structure to find a good approximate 
solution efficiently. The problem formulation and the subsequent 
relaxation will depend on the specific optimality criteria. Al¬ 
though of interest, developing approximate algorithms to design 
the selection matrix C is out of the scope of this paper and is 
left as future work. 

It is worth stressing that the sample-selection scheme that 
minimizes the error does not have to be the same for all nodes. 
Both the selection of the sampling node and the sampling shifts 




can be combined to obtain the best local reconstruction across 
all nodes in the graph. 

Remark 2: Designing C entails the selection of a subset of 
K entries out of the N entries in y^. However, has only 
N entries because Y has only N columns [cf Strictly 
speaking, there is no need to impose this restriction and more 
columns could be added to Y. As a matter of fact, if for a 
given noisy graph signal the application of the shift operator S 
attenuates the noise while amplifying the signal, the sampling 
procedure will benefit from further applications of S, even 
beyond the size of the graph N. In practice, the maximum 
number of applications will be limited by the computational 
and signaling cost associated with the application of the shift. 

V. Identifying the support of the graph signal 

In the previous sections, it has been assumed that the fre¬ 
quency support of the bandlimited signal corresponded to the 
K principal eigenvectors, which are the ones associated with 
the largest eigenvalues. However, the results presented also hold 
true as long as the basis support, i.e., the frequencies that are 
present in x, are known. To be specihc, let K, := {ki,..., 
denote the set of indexes where the signal x is sparse and, based 
on it, define the N x K matrices := [vfci,..., and 
Eye := [ofcj,..., Gfc^]. Then, all the results presented so far 
hold true if 'Vk is replaced with Vyc, and Ek, when used to 
select the active frequencies, is replaced with Eye. 

A related but more challenging problem is to design the sam¬ 
pling and interpolation procedures when the frequency support 
JC is not known. Generically, this problem falls into the class 
of sparse signal reconstruction ig, ig, |Tg. However, the 
particularities of our setup can be exploited to achieve stronger 
results. In particular, for the sampling procedure proposed in this 
paper, the so-called sensing matrix - the one relating the signal 
of interest to the observed samples - has a useful Vandermonde 
structure that can be exploited. 


A. Noiseless joint recovery and support identification 


Consider the noiseless aggregation sampling of Section III-C 


where we know that x is AT-sparse but we do not know the 
support of the K non-zero entries [cf. @1 


= Cy* = C'5fdiag(Uj)x. 


(34) 


For the case where the support is known, it was shown that a 
selection matrix C that picks the hrst K rows of ^ is enough 
for perfect reconstruction (cf. Proposition [^. 

If we reformulate the recovery problem as 

X* :=argnnn ||x||o (35) 

X 

s.t. Cyi = C^diag(ui)x, 

for the unknown support case, there is no guarantee that the 
solution X* coincides with the iC-sparse representation of the 
observed signal. When the frequency support is known, and 
provided that the selection matrix satishes the conditions in 
Remark[^ selecting K rows of \Pdiag(uy)Eyc leads to a (one- 
to-one) invertible transformation. When the support is unknown, 
guaranteeing identifiability requires selecting a higher number 
of rows (samples) |T§. The following proposition states this 
result formally. To simplify notation, the proposition assumes 


that K < N/2, but the result holds true also when that is not 
the case. 

Proposition 2: Let x and C be, respectively, a bandlimited 
graph signal with at most K non-zero frequency components and 
a selection matrix with 2K rows of the fonn C = C 2 _R-(no, Nq) 
(cf. Remark^. Then, if all the entries in Vi are non-zero and all 
the eigenvalues of S are non-zero and satisfy that f 
for all k k', it holds that 

i) the solution to •HD is unique; and 

ii) the original graph signal can be recovered as x. = Vx*. 

Proof: The proof proceeds into two steps. The first step is to 
show that any selection of 2K columns of the 2K x N matrix 
M := C’®'diag(ui) has rank 2K and, hence, it leads to an 
invertible 2Kx2K matrix. To prove this, let A" = {/i,..., f 2 K} 
be a set with cardinality 2K containing the indexes of the 
selected columns and dehne the N x 2K canonical matrix 
Ejr = [e/j,... Using this notation, the matrix contain¬ 

ing the columns of M indexed by is MEjr, which can be 
alternatively written as 


ME^ = C'5'diag(u.0E^ = (C’®'E^)(Ejdiag(u,)E^). 

(36) 

The expression reveals that MEjr is invertible because it can 
be written as the product of two 2K x 2K invertible matrices. 
The latter is true because: a) conditions C = C 2 ir (no, A^o)> 
for ^11 ^ ^ ^rid Afc 7 ^ 0 for all k guarantee that 

(C’i'Ejr) is invertible because it is a product of a diagonal 
and a full rank Vandermonde matrix (cf. Remark [T]) and b) 
condition \vj\k f 0 for all k guarantees that (Ej(diag(ui)Ejr) 
is an invertible diagonal matrix. This is true for any J^. The 
second step is to show that 2K observations are enough to 
guarantee identihability. To see why this is the case, assume 
that two different feasible solutions x^ and x^ exist. This would 
imply that M(xa—= 0. Nevertheless, the vector (Sca—xb) 
has, at most, 2K non-zero components and any choice of 2K 
columns of M generates a full rank square matrix which forces 
Sc A = Scb, contradicting the assumption of multiple solutions. 
Finally, it is worth noting that although the proposition requires 
all the eigenvalues to be non-zero and distinct, only the ones 
associated with K, need to satisfy those requirements. Note that 
the previous proof amounts to say that the matrix C’®'diag(ui) 
has full spark and, hence, the claims in the proposition coincide 
with those in ID for the 0 -norm recovery. ■ 


It is worth stressing that the proof for the joint recovery 
and identification support leverages once more the fact that 
is a Vandermonde matrix, which is a distinct feature of 
the aggregation sampling scheme proposed in this paper. To 
gain more intuition about the result, we revisit the discussion 
provided after Proposition [T] and suppose that we know that 
X is a bandlimited signal with only one non-zero frequency 
component. This means that K = 1 and that the graph signal 
can be written as x = avfc. If the value of k is known, 
which amounts to say that the support where the signal is 
sparse is known, then node i can interpolate the entire signal 
X using Xi (cf. Section III-D i. If the support is not known. 
Proposition establishes 2K = 2 samples are needed. Clearly, 
one sample is not enough because Xi = admits N 

different solutions, one per k. To see why two samples suffice, 
note that the first shift corresponds to [y^Ji = x; and the 
second to [yi ]2 = StiX; -f SijXj = A^t*. Then, node 

i can identify hrst the active frequency by hnding the frequency 
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index k whose associated eigenvalue is [yi] 2 /[yi]i- For the 
identification to succeed, the eigenvalues need to be distinct, as 
required by Proposition Once the active frequency is known, 
the corresponding frequency coefficient can be estimated as 
before by setting a = Xi/[v^]i and then the entire graph signal 
is X = avj^. This discussion provides additional support to the 
idea that bandlimited graph signals can be understood as signals 
that can be inferred from local information. 

From a computational perspective, the presence of the 0-norm 
in ( |T5l l renders the optimization non-convex, thus challenging 
to solve. A straightforward way to convexify it is to replace 
the 0-norm with a 1-norm. Note that if such a process finds 
a feasible solution, call it xj, such that ||x^||o = K, then it 
holds that x* = x^. Conditions under which this process is 
guaranteed to identify the support can be found by analyzing 
the coherence and the restricted isometry property (RIP) of the 
matrix C\Pdiag(r9i) |10|, ED- Unfortunately, determining the 
conditioning of all submatrices of a deterministic matrix (and, 
hence, the RIP) is challenging HD- The coherence of the matrix 
C^diag(r9i), denoted as /ii(C), is easier to find and it depends 
on the most similar pair of eigenvalues of S. However, the 
sparsity bound given by the matrix coherence, which requires 

is oftentimes too restrictive. A 


0 

;harc 


112 


^ — 2 |Ui(C)) 

better alternative in th'aTcase is to use the concept of f-averaged 
mutual coherence and apply the results in for deterministic 
sensing matrices. 


B. Noisy joint recovery and support identification 

If noise is present and the frequency support of the signal 
is unknown, the (AT-sparse) least squares estimate of x can be 
found as the solution to the following optimization problem 

X* :=argmm||(R«)-i/2(Cy, - Cvlrdiag(t9,)x) ||2 (37) 

X 

s.t. ||x||o<Ar 

where the matrix multiplication in the objective ac¬ 

counts for the fact of the noise being colored. As in the noiseless 
case, a straightforward approach to convexify the problem is 
to replace the 0-norm with the 1-norm and solve the problem 
xj; := arg mins li (RI*(Cyi - C’®'diag('Uj)x) ||i -f 7 I |x| 1 1 
for different values of the parameter 7. 

The challenges for support identification and the penalty paid 
in terms of error performance are related to those in the previous 
sections |fT^-|fT^. If the conditioning of matrix C\I/diag(r9i) is 
poor, which depends heavily on how similar the eigenvalues 
in A are, the performance will be bad. Bounds can be found 
using the coherence of C’®'diag(i;i), which is tractable, or by 
analyzing its RIR The results in fl^ for deterministic matrices 
can also be used here. An alternative to have performance 
guarantees in this case is to consider that the matrix C'5'diag('Lii) 
is random. This can be the case if, for example, C is designed 
as random or if there is noise in the application of the shift 
operator S. 


VI. Space-shift sampling of graph signals 

This section presents an alternative - more general - sam¬ 
pling setup that combines the selection sampling presented 
in Section III-B| with the aggregation sampling proposed in 
Section IIII-Cl 


Let us start by defining Z = Y -f W as the noisy counterpart 
of Y [cf. (|^]. Clearly, - the observed shifted signal at node 
i - corresponds to a row of matrix Z. We are now interested in 
collecting samples at different nodes and shifts, i.e., we want to 
sample matrix Z. To do so, we first define the vectorized version 
of Z as z = vec(Z^). Recall that signal z^ can be related to x^ 

via Zi = ’®'E_R-diag(wi)xif-f Wi [cf ([T^], where Vi = ViEiK- 
To write a similar equation relating z to x^, we need to define 

the X N matrix T and its corresponding reduced NK x K 
matrix Y as 


Y := 


/ diag(r9i) 
\ diag(uAr) 


Y := 


/ diag('Hi) 
\ diag('i}jv) 


(38) 


Based on this, z can be written as 

z = 0 (’S'Eif Yxif -f w, 


(39) 


where w is a vector of length obtained by concatenating 
the noise vectors for all nodes i. This implies that ( [39] l 
is a system of Y^ linear equations with K < N variables. 
Thus, our objective is to pick K of these equations in order to 
estimate xk - and, hence, x through ( [T8| ) - while minimizing 
the error introduced by the noise w. Notice that if we restrict 


ourselves to pick K equations out of the Y equations in 
positions (z — 1) Y-f 1 to i Y for some node i G {1,2,..., Y}, 
then the problem reduces to local aggregation sampling at node i 
as developed in Section |III-C| Similarly, if we restrict ourselves 
to pick the K equations out of the Y equations in positions 
[1,1 -f Y, 1 -I- 2 Y, ..., 1 -I- ( Y — 1) Y], then the problem reduces 
to selection sampling as presented in Section III-B In this 
sense, the formulation in ( |39| ) is more general. To implement 
the selection of the K equations out of the Y^ options in 


we use a binary selection matrix C as done in previous sections 
but, in this case, the size of C is Y x Y^. The reduced square 
system of linear equations can then be written as [cf. ([11] 


z = c(l(g) ('®'E;f))YX^-f Cw. (40) 


The correlation matrices of the frequency Re and time Rg errors 
of the estimator computed as the solution of ( |40| ) are [cf. (El 
and ( [ 20 I 1 ] 

Re= (y^(I® (’S'Ek))"C^ 

X (CR^C^)-i X C(I® ('i'EK))Y)'\ (41) 

Re = V^ReVf , (42) 


where R„ is the covariance matrix of the stacked vector of noise 
w. In this aggregated case, the same noise models introduced in 
Section IV-B can be present. For the white noise in observations, 
we have that R^ = cr^I, for the white noise in the original 

signal, we have that R.,e = (I 0'S') YY'^ (I 0and 
for the white noise in the active frequency coefficients we have 


that R,e = cr^ (I ( 


(’i'E^))YY" (I( 


(^Ek)) 


H 
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(a) (b) 

Fig. 3: A bandlimited graph signal, (a) The graph Q and the graph 
signal X defined on the nodes of Q. The sampling node is circled in 
red. (b) Frequency components x of the graph signal x. Given that 
there are three non-zero coefficients, the bandwidth of signal x is 3. 
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Economic Sectors 

Fig. 4: Fleat map of the graph-shift operator S of the economic network. 
It is sparse across the real economic sectors (from sector 1 to 62) while 
the artificial sectors AV and FU are highly connected. 


A. Structured observability pattern 

In the previous discussion, no structure was assumed in the 
selection matrix C. A case of particular interest is when the 
sampling schemes are implemented in a distributed manner 
using message passing. Suppose that the sampling is performed 
at node i. To compute yf\ the node i needs to have access to 
^ for all j G Afi and I' < 1. To simplify notation, and without 
loss of generality, we will assume that the sampling node is 
i — 1 and that the neighbors of z = 1 are i = 2 ,..., A^i -F 1 . 
Suppose also that node i = l computes Li shifts, from y^'^ up 
to y[^^\ This implies that node i = 1 has access to Li + 1 of 
its own samples and to Li samples of each of its A^i neighbors. 
The selection matrix C can be written as 


C = 


L, xN 


•-N, 


0 

’EL 


Li + lx(N^-N)) 


0 


NiLix(N^-NNi-N) _ 


■ (43) 


and when it is not (Section VII-Al. We then present results 


Matrix C has 1 + Li(l + A^i) rows, one per observation. The 
first 1 + Li rows correspond to the samples at node i = l and the 
remaining LiNi to the samples at its neighbors. Note also that 
matrix C (I 0 )) Y is not full (row) rank. The reason is 

that all the samples obtained at node i = l, except for the first 
one, are linear combinations of the samples at its neighbors. This 
implies that the number of frequencies that can be recovered 
using (ED is, at most, 1 -F LiNi. 

Structured observation models different from the one in ( |4^ 
can be also of interest. For example, one can consider setups 
where nodes from different parts of the graph take a few samples 
each and forward those samples to a central fusion center. 
In such a case, since the nodes gathering data need not be 
neighbors, the problem associated with some of the samples 
being a linear combination of the others will not necessarily be 
present. 

VII. Numerical experiments 

We start by illustrating the perfect recovery of synthetic noise¬ 
less graph signals, both when the frequency support is known 


A. Noiseless recovery and support selection 

Consider the 20-node undirected graph Q depicted in Fig. 
which corresponds to a realization of a symmetric Erdos-Renyi 
graph with edge probability 0.20 | |^ . With A = VA^V^ 
denoting the adjacency matrix of Q, three different graph-shift 
operators are considered; Si = A, S 2 = I — A, and S 3 = 
0.5A^. Notice that, even though the support of S 3 differs from 
that of Si and S 2 , the graph-shift operator S 3 still preserves 
the notion of locality as defined by a two-hop neighborhood. 
Note also that the three shift operators share the same set of 
eigenvectors V, but they have a different set of eigenvalues. 

Let X be a graph signal supported on Q. This signal is 
represented in Fig. To facilitate interpretation, the value of 
the signal at a given node is written explicitly next to the node 
and also coded by the color of the node. Although seemingly 
random in the node domain, the structure of the signal x is 
highly determined by the graph. To illustrate this. Fig. 3b 
presents the frequency components x of signal x, where the 
graph frequency basis are given by the columns of V. The 
figure reveals that x has a bandwidth of A = 3. Since V is 
the basis for Si, S 2 and S 3 , the frequency representation x and 
the bandwidth K are the same for any of the three operators. 


for real-world graph signals corresponding to the exchange 
among the different sectors of the economy of the United States. 
These are used to test recovery under the presence of noise 
(Section [VII-B| l as well as to illustrate the space-shift sampling 
method (Section [VII-C| l. 


As a result, the procedure described in Section III-C allows 
recovering the whole signal using three aggregated samples, no 
mater which operator is chosen for the aggregation. 

Suppose that we select node z = 4 as sampling node, which is 
circled in red in Fig.|^ If the shift is Si, the 3 first observations 
taken by that node are — [-0.55,1.27,-2.94]^. The first 
observation corresponds to the value of the signal at node 4, 
the second one to the aggregated signal at its neighbors and 
the third observation corresponds to a linear combination of the 
signal values within its two-hop neighborhood. Since K = 3, 
Proposition [T] guarantees recovery if; i) the 3 first eigenvalues 
of the shift operator are distinct and ii) the 3 first values of 
V 4 are non-zero. It turns out that for Si and node 4 these 
two conditions hold true and, hence, the interpolation in ( [T3] l 
yields the original signal in Fig. In fact, for the network at 
hand, these two conditions are satisfied for all nodes and shift 
operators considered. This implies that perfect reconstruction 
in a noiseless setting is achieved independently of which node 
aggregates the information and which shift operator - among 
the three presented - is picked. To better asses the conditions 
in Proposition [T] we build 10,000 different random graphs 
where the edge probability is randomly chosen from the interval 
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Fig. 5: Recovery rate of bandlimited signals in random graphs when the frequency support is unknown. Signals are recovered through the 
norm -1 relaxation of problem for different number of observations K and three different graph-shift operators Si = A (blue), S 2 = I — A 
(red), and S 3 = 0.5A^ (magenta). Random graphs with different edge probahilities were considered: (a) 0.15 , (b) 0.20, and (c) 0.25. The shift 
operator S 3 consistently outperforms the others, which can be attributed to a lower matrix coherence. 


[0.15,0.25]. Realizations that do not give rise to a connected 
graph are discarded. We vary the number of nodes from 10 to 
30 and the active frequencies of the graph signals from 1 to 
5. For each random graph and signal defined on it, we test for 
perfect signal recovery on every node. The simulations show 
that in 99.89% of the cases the signal is successfully recovered. 

The graph signal x can be sampled and recovered even when 
the frequency support is unknown, i.e., when we know that 
X = V“^x contains K = 3 non-zero components, but we 
do not know the indices of these K active frequencies. In 
this case, however, 2K — 6 samples are needed to guarantee 
identifiability (cf. Proposition |^. By solving problem ( |T5l ), the 
signal can be recovered at every node and using any of the three 
shift operators, as in the previous case. However, when solving 
a relaxed version of problem (ID, accurate signal recovery 
depends on the specific network, signal and node selected for 
reconstruction. Moreover, the recovery rate depends on the 
choice of the graph-shift operator S. For example, for the signal 
in Fig. 3a solving a 1-norm relaxation of the problem ( |T5| ) 
yields the original graph signal x if S = A and i = 4, but 
fails if S = I — A and i = 5 where node i = 5 is the 
right neighbor of node z = 4. To assess recovery better. Fig. 
plots the success rate - fraction of realizations for which the 
actual signal was recovered - for graph-shifts Si, S 2 and S 3 , 
and different number of observations. Each point in the plots 
represents an average across all nodes in the network, 5 signal 
realizations and 10 random graph realizations. The three plots 
correspond to symmetric Erdos-Renyi graphs generated using 
different edge probabilities: 0.15, 0.20, and 0.25. The recovery 
rate for S 3 = 0.5A^ is consistently higher than for the other 
shift operators considered. This is not surprising: when squaring 
the adjacency matrix to generate S 3 , the dissimilarity between 
any pair of eigenvalues is increased, which reduces the matrix 
coherence Hi{C) associated with S 3 = 0.5A^ and facilitates 
sparse recovery (cf. last paragraph in Section [V-A[ ). Nonetheless, 
if success rate is the main concern, there exist relaxations of the 
0 -norm that give better results than the 1 -norm used 


B. Recovery in the presence of noise 

The Bureau of Economic Analysis of the U.S. Department 
of Commerce publishes a yearly table of input and outputs 
organized by economic sectors p4) . More precisely, we have 
a set N of 62 industrial sectors as defined by the North Amer¬ 
ican Industry Classification System and a similarity function 


U : Af X Af —i' M+ where represents how much of 

the production of sector i, expressed in trillions of dollars per 
year, was used as an input of sector i' on average during years 
2008, 2009, and 2010. Moreover, for each sector we are given 
two economic markers; the added value (AV) generated and the 
level of production destined to the market of final users (EU). 
Thus, we define a graph on the set of iV = 64 nodes comprising 
the original 62 sectors plus the two artificial ones (AV and EU) 
and an associated symmetric graph-shift operator S defined as 
Sy = {U(i,j) + U[j,i))/2. We then threshold S in order to 
increase its sparsity by setting to 0 all the values lower than 0.01 
to obtain the shift operator S = VAV^, which is normal given 
that it is symmetric; see Eig. Consider the signal x S 
on the mentioned graph where x contains the total production 
- in trillion of dollars - of each sector (including AV and EU) 
during year 2011. Signal x is approximately bandlimited in S 
since most of the elements of x = V^x are close to zero; see 


Eig. 6 a (top). In particular, the reconstructed signal X4 = V4X4 
obtained by just keeping the first four frequency coefficients 
attains a reconstruction error of 3.5 x 10“^ computed as the ratio 
between the energy of the error and the energy of the original 
signal. This small reconstruction error is nonetheless noticeable 
when plotting the original signal x and the reconstructed one 
X 4 ; see Eig. (bottom). To present a reasonable scale for 
illustration, sectors AV and EU are not included in Eig. since 
the signal takes out-of-scale values for these sectors. 

In Sections IVII-BII to IVII-B3I we consider the bandlimited 
signal X 4 as noiseless and add different types of Gaussian noise 
to analyze the interpolation performance at different nodes. 
Differently, in Section |VII-B4| we interpret the whole graph 
signal X as a noisy version of X 4 and analyze the reconstruction 
error when interpolating x from just 4 samples. 

1) White noise in the observed signal: We perform aggrega¬ 
tion sampling of multiple noisy versions of X 4 via successive 
applications of the graph-shift S at different economic sectors 
(nodes). The noisy versions of X 4 are generated by adding noise 
to the observed signal as described in ( |25| ). The power of the 
white noise is the same for all nodes and is computed so that, 
averaging across nodes, the linear signal to noise ratio (SNR) 
for the first, second, third and fourth observations in each node 
is 2, 10, 50, and 250, respectively. This increase in SNR is 
attributable to the fact that successive applications of the shift S 
increase the signal magnitude while we keep the noise power cr^ 


constant. In Eig. 6 b we plot the empirical average reconstruction 
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(d) (e) (f) 

Fig. 6 : (a) Top: Frequency representation of the graph signal x in the basis of eigenvectors of the graph-shift S. The signal is approximately 
handlimited. Bottom; Signal x (blue) and its reconstruction X 4 (magenta) when keeping only the first 4 frequency components, (b) Empirical 
(red circle) and theoretical (blue star) reconstruction errors for different sampling nodes when white noise is added to the observed signal, (c) 
Empirical (red circle) and theoretical (blue star) reconstruction errors when white noise is added directly to the signal X4. (d) Signal X4 (blue) 
and the best reconstruction achieved when sampling an economic sector for the three types of noise considered: white noise in the observations 
(magenta), white noise in the signal (orange) and white noise in the active frequency components (green), (e) Reconstruction errors for different 
sampling nodes when interpolating signal x based on four observations, (f) Signal x (blue) and the best reconstruction (magenta) achieved when 
performing local aggregation sampling of economic sectors. 


error at different nodes across 1,000 noisy realizations of X 4 
and compare it with the theoretical average error, i.e., the trace 
of in @ [cf. ( 1 ^]. We first observe that the computed 
theoretical error indeed coincides with the average empirical er¬ 
ror across realizations. Moreover, notice that the reconstruction 
performance is highly node dependent. The error is minimized 
for the reconstruction based on the artificial sectors AV and FU. 
This is reasonable since these two nodes - unlike other sectors 
- are closely related to every other sector of the economy (cf. 
Fig. 1^1. Furthermore, the sectors achieving the worst reconstruc¬ 
tion errors are ‘Publishing Industries’ and ‘Ground Passenger 
Transportation’ corresponding to nodes 34 and 31. This can be 
explained by analyzing the vectors 1)34 = 'LI 34 E 4 and V 31 (cf. 
Lemma [^ 1 . Even though both vectors have all four components 
different from zero, which guarantees perfect reconstruction in 
the noiseless case (cf Proposition [^l, they possess an element 
whose absolute value is in the order of 10 “^, increasing the 
sensitivity of the reconstruction in the presence of noise. For all 
other nodes the smallest element of Vi is at least one order of 
magnitude larger. Fig. [^presents the reconstruction obtained by 
aggregation sampling in node 46 corresponding to ‘Professional 
Services’ - best among real economic sectors, i.e., excluding 
AV and FU - which achieves an error of 0.26. 

2) White noise in the original signal: Similarly to the 
analysis performed in the previous section, we investigate the 
reconstruction performance of aggregation sampling at different 
nodes. However, in this case, the noise is added to the original 
signal, following the model described in ( |26l l. The power of 
the white noise is set to induce a linear SNR of 10^. As 
was the case in the previous section, the average empirical 


error (across 1,000 realizations) matches closely our theoretical 
estimates; see Fig. 6 c Moreover, the specific nodes that lead 
to a good (bad) interpolation performance are very similar to 
those in the previous noise model. Indeed, sectors 34 and 31 
have the highest reconstruction error whereas AV and FU attain 
the best reconstructions. Fig. 6 d shows the best reconstruction - 
excluding AV and FU - which amounts to an error of 0.001 and 
corresponds to the sector ‘Professional Services’ at node 46. 

3) White noise in the active frequencies: We consider a third 
category of noisy versions of X 4 obtained by adding white noise 
only to the four active frequencies, as described in (|27]l. The 
power of the white noise is set to induce a linear SNR of 
10^. The empirical reconstruction error associated with each 
node - averaged over 1,000 noisy realizations of X 4 - is the 
same among nodes. This validates the analysis in ( |28l l, which 
stated that, for this noise model, the quality of the reconstruction 
is node independent. In Fig. 6 d we present an example of such 
reconstruction, achieving an error of 0 . 01 . 

4) Real-world noisy signal: We interpret the graph signal x 
as a noisy realization of a signal of bandwidth 4. Hence, our goal 
is to obtain the best reconstruction of x based on 4 observations. 
As described in ( |20| ) and shown before, interpolation perfor¬ 
mance is highly node dependent. Indeed, the reconstruction error 
when keeping the first 4 observations at each node spans 5 orders 
of magnitude depending on the sampling node, although for 
most nodes it is contained between 10“^ and 10“^; see Fig. 6 e 


The best reconstruction among the real sectors is achieved 
by ‘Insurance Carriers’ (node 40). The best and the median 
reconstructions are acceptable, attaining errors of 0.0035 and 
0.019, respectively. Fig. [ 6 f| depicts the best reconstruction. 
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Sampling 

Strategy 


Min. error 

Median en'or 

Mi 

[Sx]i 

[S^x]i 

[S^x]i 

.0035 

.019 

Mi 

Mi 

Mfc 

Mi 

.0039 

4.2 

[Sx]i 

[Sx]j 

[Sxjfe 

[Sx]i 

.0035 

.030 

[S^x]i 

[S^x],. 

[S^x]j. 

[S"x], 

.0035 

.0055 

[S^x]i 

[S^x],. 

[S^xjj. 

[S^x], 

.0035 

.0040 

Mi 

[Sx]i 

Mi 

[Sx]j 

.0035 

.039 


TABLE I: Minimum and median reconstruction error - energy of the 
error as a fraction of the energy of the signal x - for different sampling 
strategies. The first sampling strategy corresponds to aggregation sam¬ 
pling, i.e., observing the same node i after successive applications of 0, 
1, 2, and 3 graph-shifts S. The second sampling strategy corresponds 
to selection sampling, i.e., observing the value of the signal x at 4 
different nodes i,j,k,l. The remaining strategies correspond to more 
general space-shift sampling schemes. 


C. Space-shift sampling 


In Section VII-B4 we analyzed the accuracy of interpolating 
the U.S. economic activity after aggregation sampling in differ¬ 
ent economic sectors. The minimum and median reconstruction 
errors are presented in the first row of Table |Ij where the recon¬ 
struction error is quantified as the ratio between the energy of 
the error and that of the original signal. An alternative approach 
is to implement selection sampling, i.e. to sample the signal x in 
4 different sectors - excluding the artificial sectors AV and FU 

- and interpolate the whole signal from these 4 observations, 
as explained in Section |III-B| Recall that reconstruction is not 
guaranteed for every subset of 4 nodes since we must have 
invertibility of (CVi^) [cf. (|^]. By analyzing the minimum and 
median reconstruction errors - see the two first rows in Table U 

- it is clear that the node aggregation sampling outperforms 
the node selection sampling. This is intuitive since most of the 
energy of the signal is contained in the two first frequencies [cf. 
Fig. I^top)], which are associated with the largest eigenvalues. 
Hence, after successive implementations of the graph-shift, the 
error in estimating these frequencies is reduced, resulting in a 
smaller error in the interpolation of the whole signal. 

As developed in Section |Vl] more general sampling strategies 
can be implemented. For example, we can sample the value 
of the signal at 4 nodes after the application of one, two or 
three graph-shifts. The results - listed in rows 3, 4 and 5 of 
Table |I]- reveal that reduction in the median error after each 
graph-shift application is conspicuous, especially when going 
from no applications - median error of 4.2 - to one application 

- median error of 0.03. A different alternative is a sampling 
strategy that selects the original signal and the signal after one 
shift in two different sectors. The results, listed in the last row 
of Table [Ij show that this configuration leads to a very good 
reconstruction performance: 0.0035 minimum error and 0.039 
median error. Note that with this sampling configuration, the 
two sectors are only required to compute the aggregated activity 
of their one-hop neighbors. 


VIII. Conclusions 

A novel scheme for sampling bandlimited graph signals - that 
admit a sparse representation in the frequency domain - was 
proposed. The scheme was based on the aggregation of local 
information at a single node after successive applications of 
the graph-shift operator. This contrasted to most existing works, 
which focus on sampling the value of the signal observed at 
a subset of nodes. Our scheme was shown to be equivalent 


to classical sampling for directed cycle graphs whereas, for 
more general graphs, the Vandermonde structure of the sampling 
matrix was exploited to determine the conditions for perfect 
reconstruction in the absence of noise. Reconstruction under 
correlated noise was analyzed, and design criteria to pick the 
sampling node and shifts leading to optimal noisy reconstruction 
were discussed. Scenarios where the specific set of frequencies 
present in the bandlimited signal is not known were also inves¬ 
tigated and connections with sparse signal reconstruction were 
drawn. Finally, a more general sampling scheme was presented 
which contained, as particular cases, the selection sampling as 
well as our local aggregation approach. The various sampling 
and interpolation scenarios were illustrated through numerical 
experiments in both synthetic and real-world graph signals. 
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